Automatically formulating and solving prediction problems

Authors

  • Benjamin Schreck
  • Kalyan Veeramachaneni
Abstract

In this paper, we designed a formal language, called Trane, for describing prediction problems over relational datasets, and implemented a system that allows data scientists to specify problems in that language. We show that this language can describe prediction problems across many different domains, including those on KAGGLE, a data science competition website; we express 29 different KAGGLE problems in it. We designed an interpreter that translates user input, specified in this language, into a series of transformation and aggregation operations that are applied to a dataset to generate labels for training a supervised machine learning classifier. Using a smaller subset of this language, we developed a system to automatically enumerate, interpret and solve prediction problems. We tested this system on the Walmart Store Sales Forecasting dataset found on KAGGLE [1] by enumerating 1077 prediction problems and then building models that attempted to solve them, for which we produced 235 AUC scores. Considering that only one of those 1077 problems was the focus of a 2.5-month-long competition on KAGGLE, we expect this system to deliver a thousandfold increase in data scientists' productivity.
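To make the idea in the abstract concrete, the following is a minimal sketch of how a prediction problem might be represented as a filter plus an aggregation, interpreted into labels, and enumerated. It is not Trane's actual syntax or API: the class `PredictionProblem`, the function `enumerate_problems`, the operation names, the column names and the sample rows are all hypothetical, and the sketch ignores time cutoffs entirely.

```python
from dataclasses import dataclass
from itertools import product
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

# Hypothetical operation vocabularies; names are illustrative, not Trane's.
FILTERS: Dict[str, Callable[[Row], bool]] = {
    "all_rows": lambda row: True,
    "holiday_only": lambda row: row["is_holiday"] == 1,
}
AGGREGATIONS: Dict[str, Callable[[List[float]], float]] = {
    "sum": sum,
    "max": max,
    "count": lambda values: float(len(values)),
}


@dataclass(frozen=True)
class PredictionProblem:
    """One problem: filter rows, aggregate a column per entity, threshold."""
    entity: str            # column that identifies the entity being labeled
    filter_name: str       # key into FILTERS
    column: str            # column whose values are aggregated
    aggregation_name: str  # key into AGGREGATIONS
    threshold: float       # aggregate > threshold gives label 1, else 0

    def labels(self, rows: List[Row]) -> Dict[Any, int]:
        keep = FILTERS[self.filter_name]
        aggregate = AGGREGATIONS[self.aggregation_name]
        grouped: Dict[Any, List[float]] = {}
        for row in rows:
            if keep(row):
                grouped.setdefault(row[self.entity], []).append(row[self.column])
        return {
            key: int(aggregate(values) > self.threshold)
            for key, values in grouped.items()
        }


def enumerate_problems(
    entity: str, columns: List[str], threshold: float
) -> List[PredictionProblem]:
    """List every filter/column/aggregation combination as a problem."""
    return [
        PredictionProblem(entity, f, c, a, threshold)
        for f, c, a in product(FILTERS, columns, AGGREGATIONS)
    ]


# Made-up rows, loosely shaped like a store sales table.
ROWS: List[Row] = [
    {"store": 1, "weekly_sales": 100.0, "is_holiday": 0},
    {"store": 1, "weekly_sales": 250.0, "is_holiday": 1},
    {"store": 2, "weekly_sales": 50.0, "is_holiday": 0},
    {"store": 2, "weekly_sales": 60.0, "is_holiday": 1},
]

if __name__ == "__main__":
    problem = PredictionProblem("store", "all_rows", "weekly_sales", "sum", 200.0)
    print(problem.labels(ROWS))
    print(len(enumerate_problems("store", ["weekly_sales"], 200.0)))
```

In this sketch, enumeration is a plain Cartesian product over small operation vocabularies, so the number of candidate problems grows multiplicatively with each vocabulary added. The labels produced for each problem could then be passed to any supervised classifier.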


Similar resources

On Representations of Problems of Reasoning about Actions

The purpose of this paper is to clarify some basic issues of choice of representation for problems of reasoning about actions. The general problem of representation is concerned with the relationship between different ways of formulating a problem to a problem solving system and the efficiency with which the system can be expected to find a solution to the problem. An understanding of the relat...


Reformulating Planning Problems by Eliminating Unpromising Actions

Despite considerable progress in solving planning problems, more complex problems remain hard and challenging for existing planners. One of the most promising research directions is exploiting knowledge engineering techniques such as (re)formulating the planning problem so that it is easier for existing planners to solve. In particular, it is possible to automatically gather knowledge from toy planning ...


An Adaptive Approach to Increase Accuracy of Forward Algorithm for Solving Evaluation Problems on Unstable Statistical Data Set

Nowadays, hidden Markov models are extensively utilized for modeling stochastic processes. These models help researchers establish and implement the desired theoretical foundations using Markov algorithms such as the Forward algorithm. However, using the stability hypothesis and the mean statistic to determine the values of Markov functions on an unstable statistical data set has led to a significant reducti...


Formulating and Solving Nonlinear Programs as Mixed Complementarity Problems

We consider a primal-dual approach to solve nonlinear programming problems within the AMPL modeling language, via a mixed complementarity formulation. The modeling language supplies the first order and second order derivative information of the Lagrangian function of the nonlinear problem using automatic differentiation. The PATH solver finds the solution of the first order conditions which are genera...


A logical framework for commonsense predictions of solid object behaviour

Predicting the behaviour of a qualitatively described system of solid objects requires a combination of geometrical, temporal, and physical reasoning. Methods based upon formulating and solving differential equations are not adequate for robust prediction, since the behaviour of a system over extended time may be much simpler than its behaviour over local time. This paper presents a first-order...



Publication date: 2016